perm filename APP2[AIM,DBL]1 blob
sn#121227 filedate 1974-09-24 generic text, type T, neo UTF8
00100 .DEVICE XGP
00200 .FONT 1 "FIX25"
00300 .FONT 2 "SIGN57"
00400 .FONT 3 "SHD40"
00500 .FONT 4 "BDI25"
00600 .FONT 5 "NGB30"
00700 .FONT 6 "NGR20"
00800 .TURN ON "↓_π{"
00900 .TURN ON "⊗" FOR "%"
01000 .MACRO B ⊂ BEGIN VERBATIM GROUP ⊃
01100 .MACRO E ⊂ APART END ⊃
01200 .TABBREAK
01300 .COMPACT
01400 .EVERY FOOTING(⊗6First Draft .... {DATE},page A2.{IF PAGE = 1 THEN 1 ELSE PAGE}⊗*,)
01500 .EVERY HEADING(⊗4Synthesis of Large Programs⊗*,⊗3BEINGS⊗*,⊗4Doug Lenat⊗*)
01600 .COUNT PAGE PRINTING "1"
01700 .NEXT PAGE
00100 .SELECT 1
00200
00300 ⊗2APPENDIX 2.⊗* ⊗3THE BEINGS⊗*
00400
00500 Here we present summaries of the knowledge embedded in each BEING. Only the
00600 most important parts of each BEING are even mentioned. First we present those
00700 BEINGS used to write CF. Next come the ones we had to add to the pool to get
00800 PUP6 to write GI and AIR. Among the first group, we further subdivide it into
00900 (a) high-level, domain-specific, (b) low-level domain-specific, (c) ubiquitous,
01000 problem-independent, (d) high-level, problem-independent, (e) low-level,
01100 problem-independent, (f) non-BEING knowledge in variables, and (g) a few
01200 interesting demons and functions. The additions following CF are so small we
01300 don't subdivide their descriptions.
01400
01500 .SELECT 5
01600
01700 (i) The knowledge necessary to write a concept formation program:
01800
01900 A. High-level, domain-specific knowledge
02000
02100 .SELECT 1
02200
02300 ⊗4CONCEPT-FORMATION⊗*
02400 The user must be aware that we are about to undertake concept formation.
02500 Inference and attention-focussing demons must be turned on.
02600 After completion of this task, PUP will be able to learn concepts.
02700 This is a specialized form of attending, learning, and doing inductive
02800 inference. It is an alternative to grammatical inference, pattern
02900 recognition, and simulated evolution. Its structure must be one of the
03000 following: classificatory concept formation, comparative concept formation,
03100 or metrical concept formation. We must make the boolean decision as to
03200 whether or not concepts may vary with time. Similarly, whether the speed
03300 of presentation of the stimuli is relevant; if so, then we must constrain
03400 the effort spent on various phases of identification. Instances may be
03500 left in view indefinitely, or may be removed right after processing;
03600 this latter case holds for CF; it means we must derive all relations
03700 (features) as soon as we see a scene. The program will have to be just
03800 complex enough to handle conjunctive, disjunctive, or both kinds of
03900 concpts; this is another decision to make. Similarly for positive,
04000 negative, both, or neither kinds of transfer (psychological), which
04100 affects the recognition that a concept is new, and how previously
04200 learned concepts interact with the learning of new ones. We must whether
04300 to use positive, negative, or boht kinds of instances of a concept.
04400 Subject-specific behavior may be required; that is, the program may
04500 have to input a vector describing a particular individual, and its
04600 whole structure must mimic this subject. The last decision is one of
04700 adapting the program to an extended sample dialogue which the user must
04800 furnish; this will help both to check out the program writtten, and to
04900 fix various print statements and I/O formats. It is easy to call this
05000 being (.1), it has a 50-50 chance of calling* itself, it has only a
05100 0.5 chance of succeeding, but the effort to try it is moderate (.5).
05200 There is no fundamental reason for delaying its investigation (.1).
05300 It recognizes itself only through exact matching of one of seven
05400 patterns. It has sentences describing what it does, how, and why.
05500 It is unlikely (-70) to be called if some type of concept formation
05600 is already doable by PUP; if PUP wants to characterize classes, then
05700 it's very likely (88) to be called. The presence of new information
05800 delays (-60) our calling the being, since it might affect what we
05900 should do. Conversely, the absence of new information mildly (40)
06000 encourages us to go on and try it. When finished, the user must be
06100 aware that PUP has decided on a particular type of concept formation,
06200 and that it has done it. The other beings affected depend on the
06300 decisions mentioned earlier.
06400
06500 This is an over-abundance of information. From now on, I will
06600 only give the little pieces of information which are crucial to the
06700 writing of some part of the CF system.
06800
06900 ⊗4CLASSIFICATORY CONCEPT FORMATION⊗*
07000 To do this, we must partition a domain, in an interactive "guessing"
07100 manner.
07200
07300 ⊗4COMPARITIVE CONCEPT FORMATION⊗*
07400 Same as above, but then we must partially order the equivalence classes
07500 of the partition. It is harder, also.
07600
07700 ⊗4METRICAL CONCEPT FORMATION⊗*
07800 Same as previous being, but we must also induce a metric on the
07900 partial ordering of the classes. This is even more complicated.
08000
08100 Since we actually do classificatory CF, the beings to order
08200 a partition and to metrize an ordering weren't implemented.
08300
08400 ⊗4PARTITION A DOMAIN⊗* in a guessing, interactive manner.
08500 The partition may be only partial, it may be only weak, and, most
08600 crucially, the being must be able to do some of these: partition by
08700 accepting an element and getting its class name (guessing the name
08800 and then checking it somehow), partition by accepting a class name
08900 and getting its member elements, partition by accepting ordered
09000 pairs <element, classname>. The fringe of conciousness demon must
09100 be activated from now on.
09200
09300 ⊗4PARTITION BY TAKE ELEMENT GET CLASS⊗*
09400 Take hold of an element (by reading, for example), and then work to
09500 get the name of the class it belongs to (by guessing, then verifying
09600 the guess, for example). Then modify the structure of the class(es)
09700 involved.
09800
09900 ⊗4PARTITION BY TAKE CLASS GET ELEMENT⊗*
10000 Take hoold of a class name, and then work to get elements it contains.
10100 Then modify the structure of the class and the element(s) involved.
10200
10300 ⊗4PARTITION BY TAKE ELEMENT AND CLASS⊗* simultaneously.
10400 Take hold of both and element and its corresponding class name, and
10500 use these to modify the structure of the partition; i.e., the class
10600 mentioned if the partition is stored by classes.
10700
10800
10900 ⊗5B. Low-level, domain-specific knowledge⊗*
11000
11100 ⊗4RECOGNIZE CONTRADICTION⊗*
11200 It can translate (... is incompatible with ...). It is a predicate,
11300 fairly easy to write therefore. It is composed of SOMEOF the following:
11400 Probability≡0 contradiction, Probability≡1 contradiction, Probability
11500 >0 and <1 contradiction. Since these names are fairly cryptic, some of
11600 their parts (e.g, HOW) must be printed out to help the usr choose.
11700
11800 ⊗4PROBABILITY ≡ 0 CONTRADICITON⊗*
11900 Since this is a very simple thing in our domain of concept formation,
12000 it is immediately encodable as (MEMBER arg1 arg2). That is, if arg1
12100 has probability zero of occurring in arg2, yet it does, then we have
12200 a contradiction.
12300
12400 ⊗4PROBABILITY ≡ 1 CONTRADICTION⊗*
12500 Immediately encodable as (NOT (MEMBER arg1 arg2)). If arg1 has probability
12600 one of occurring in arg2, yet it doesn't, then we have a contradiction.
12700
12800 ⊗4PROBABILITY >0 AND <1 CONTRADICTION⊗*
12900 Immediately enacodable as NIL. If arg1 might and might not occur in arg2,
13000 we can't get a contradiction just by checking its membership. Of course,
13100 the idea that this is the only way to prove contradiction is what makes
13200 these beings domain-specific!
13300
13400 ⊗*SCENE⊗*
13500 This is a data structure, composed of four subparts. The first is
13600 a set O of objects. Next is an atom indicating the name N of the scene.
13700 Next come two lists of features, where a feature is just a predicate
13800 and its arguments. The first is the static relations S between objects.
13900 Finally we have the dynamic relations D between objects.
14000
14100 ⊗5C. Ubiquitous, problem-independent beings and functions⊗*
14200
14300 ⊗4CHOOSE FROM⊗*
14400 All its arguments must be beings, else it prints a nasty warning message.
14500 We select the best being and apply it. If it fails, we re-order the
14600 remaining beings and apply the best one of them. Note that this new
14700 reordering may use knowledge gleaned during the earlier, unsuccessful
14800 being try. Thus, this is a (possible) intelligent nondeterministic
14900 branch point. The intelligence lies (or fails to lie) in the comparison
15000 function, BETTER, which decides who goes next.
15100
15200 ⊗4SATISFY⊗*
15300 This is the equivalent of a pattern-directed goal statement. We ask
15400 each being, "Can you do anything matching x?". We take the list of those
15500 answering affirmatively, and CHOOSE FROM it one being after another
15600 until the desired effects are realized. Notice that a being who said
15700 "probably" may succeed in his application and yet not effect the result
15800 we wanted, so that a trivial call on CHOOSE FROM is insufficient.
15900 The beings possible affected are just those answering affirmatively.
16000
16100 ⊗4MESSAGE⊗*
16200 This being has a main effect (AWARE USER x), hence is very frequently
16300 called. The forgetful-user demon trims the aware user list periodically.
16400 Message looks to see if its message is on that list; if not, it inserts it
16500 and prints it out to the user. If it is, it moves the message to the front
16600 of the aware-user list and prints out nothing.
16700
16800 ⊗4DETERMINE ARG VALUES⊗*
16900 These functions take as input the name of a function, and output a
17000 description of what arguments it will ever be called with (in the existing
17100 code.) For example, it might say "arg1 will always be NAME:OF:CLASS, and
17200 arg2 will consecutively be all integers from 3 to (LENGTH SET:OF:CLASSES)".
17300 At present these work in the obvious way, looking at everything. The
17400 tremendous amout of CPU time spent in these functions indicates that I
17500 should have made special assert:lists for argument instantiations, and
17600 updated them each time a being is called in the target code.
17700
17800 ⊗4FLOW-PRECEDED⊗*
17900 This being must search through he code to find a form matching a given
18000 pattern. Although it is used under ten times in the total dialog, it is
18100 so costly that I've implemented it as an ask-the-user call. Work must be
18200 done here to understand why this is so inefficient, and to remedy it.
18300
18400 ⊗4FIND AND TAG⊗*
18500 This being is similar to flow-preceded, except that the pattern-matching
18600 is between two constant strings. This is tolerably efficient in CPU time
18700 and is used heavily throughout the writing of CF.
18800
18900 ⊗4TRANSLATE⊗*
19000 The natural language front end is managed by this being. It asks each being
19100 whether it recognizes a given string. Translate then takes the "best" --
19200 the most probable -- of these as the translation, and can backtrack and
19300 reorder the remaining interpretations if it has to. If called with no
19400 argument, it examines various assert-lists to see if it can do any good.
19500 The idiom demon must be activated during the control period of this being.
19600
19700 ⊗4REINVESTIGATE DECISION⊗*
19800 This is usaully called by a demon who watches the deferred-decision
19900 assert list. We transfer the decision in question from the deferred to
20000 the undeferred decision assert list. A deferral demon will promptly
20100 react to anything on this latter list. An interesting caution: it was
20200 necessary to inhibit all demons during the execution of this being, for
20300 reasons very similar to those leading to lock and unlock system commands.
20400 THe fact that some being might have to be demon-uninterruptable forced us
20500 to institute an entire new question asking just about this tiny point!
20600
20700 ⊗4DEFER DECISION⊗*
20800 Remove the decision from the undeferred decision assert list.
20900 Determine the situation when we must next reinvestigate this decision.
21000 This will be some predicate examing the state of the world. If this
21100 predicate is true currently, we must resolve the decision now. Otherwise,
21200 we put the decision on the deferred decision list, attached to its (new)
21300 reinvestigation predicate. Demons must be inhibited during this being's
21400 reign, to ensure that it's notions of the world are accurate upon exit.
21500
21600 ⊗4WHEN NEXT⊗*
21700 Manipulate the decision to extract the name of the variable holding
21800 information relevant to deferring the decision. Utilize this knowledge
21900 so as to keep the effects of the decision irrelevant; i.e., find the
22000 (next) situation in which they are not irrelevant. Whoever called
22100 this being is now asserted to be aware of its results. THis is fairly
22200 complex, and the being is not called unless it is necessary. As it happens,
22300 it is called a few times for every decision to be made about every being
22400 in the target program (several hundred times).
22500
22600 ⊗4UTILIZE⊗*
22700 This being applies various knowledge "variables," starting at specific
22800 ones and moving toward very general ones, until one of them reports
22900 being able to acheive the desired goal.
23000
23100 ⊗4RESOLVE DECISION⊗*
23200 Again, all demons must be inhibited. After some preliminary searching
23300 and very trivial theroem-proving fail, this being resorts to asking the
23400 user about how to resolve a given decision.
23500
23600 ⊗4ASK USER ABOUT⊗*
23700 We determine the argument instantiations of the little piece of code
23800 we're worrying about, determine the type of decision to be made, and
23900 apply the specific knowledge variable for x-ing that type of decision.
24000 Here, we get x by examing who called this being and why. To write a
24100 specialized version of ask-user-about, we just write a standard
24200 print, read, and assign function, with the details left unspecified
24300 until the sample session is read in.
24400
24500 ⊗4BETTER⊗*
24600 This function is used to compare two beings, and decide which of them
24700 should gain control. It evaluates their WHEN parts, and if they tie
24800 it evaluates their complexity vectors. Note that "eval" here is not
24900 trivial: each dimension of the complexity vector of a being can be a
25000 little program which examines itself, other beings, and the state of the
25100 world before deciding on a numerical answer to rEturn.
25200
25300 ⊗5Handling of User Interrupts⊗*
25400 There are several functions and beings involved in this process.
25500 Initially, the user describes how often the system is to give him the
25600 opportunity to interrupt and query it. At each of these times, the
25700 HANDLE USER INTERRUPT function asks the user if he wants to interrupt;
25800 if so, PROCESS USER INTERRUPT is called to do the job. In addition to
25900 asking for pieces of any being, the user may request limited simulated
26000 execution of various pieces, and may order the current being to FAIL.
26100
26200
26300
26400 ⊗5D. High-level, problem-independent knowledge: how to write programs⊗*
26500
26600 ⊗4SERVE⊗*
26700 Obtain information until some of it is "executable," then carry it out.
26800 The forgetful-user demon is activated. Without this top-level purpose,
26900 PUP5 sits contentedly, never wanting to accept a new task.
27000
27100 ⊗4WRITE PROGRAM⊗*
27200 The user must be made aware that PUP is about to write a program, what
27300 kind of program it is, what its name is (this will force a get-name
27400 being call), and that its type has been examined (this will cause a
27500 study-type being call). Upon exit, he must be told that PUP has completed
27600 the task, and what its new capabilities are. To wite a program, one enters
27700 a loop, broken only when several completion conditions are all true
27800 simultaneously: the top-level task is now a being, there are no undefined
27900 sections of code, there are no warnings left about the code, there is
28000 no executable information anywhere, ther is no new but unprocessed
28100 information, there are no decisions still pending (except those requiring
28200 "everything else" to be complete; e.g., the adaptation ofoutput formats
28300 using a sample session). If we do break out of the loop, we must update
28400 the list of programs written, the list of what PUP can now do, of what
28500 the user may do, we find the set of support of the top-level task and
28600 create a new file with the relevant functions and beings (which auto-
28700 matically does global initializations and then enters the top-level
28800 task instead of SERVE). In general, of course, we won't break out,
28900 so we activate all the current demons and go on. All the body of the
29000 loop is is one CHOOSE FROM, between six alternatives: obtaining some
29100 usable info, using some usable info, filling in some function call which
29200 is currently undefined, clarifying some little piece of code known to
29300 be improbable for some specific reason, adapting some known function
29400 to conform to some specific new requirements, fixing some piece of code
29500 which doesn't work the way it claims to work. The last two of these are
29600 simply program modification and debugging, respectively! Failure of one
29700 of these six beings simply causes CHOOSEFROM to try another one; failure of
29800 a demon causes the whole WRITE PROGRAM being to fail. During its reign,
29900 the program-writing demons, deferral-demon, ad reinvestigation-demon are
30000 all activated. Its complexity vector is dependent upon that of the
30100 being closest to the task it must perform.
30200
30300 ⊗4OBTAIN USABLE INFORMATION⊗*
30400 The WHEN part informs us that this is always undesirable (-10) but
30500 is OK (111) if there exists new but not yet usable information. All
30600 we do here is CHOOSE FROM the following four alternatives: translate
30700 something, get brand new information, analyze the implications of
30800 existing information, extract a small relevant subset of the existing
30900 information to concentrate on.
31000
31100 ⊗4USE INFORMATION⊗*
31200 This demands that some executable information exist. We select one such
31300 piece, and try to execute it. If we fail, its worth is decreased; if we
31400 succeed, it is removed from the executable info assert list.
31500
31600 ⊗4FILL IN UNDEFINED SECTION⊗*
31700 THere must be some undefined section. If so (80) we don't want any
31800 hi-priority (≤20) coding warnings still around (-150), and we do like
31900 there to be something both undefined and known to be encodable (96).
32000 We fix a choice of what to encode, and try to acheive its encoding.
32100 If we fail, we update the difficulty of that choice, and may assert that
32200 we want some specific new information to relieve the problem. In addition
32300 to ENCODE, this being also may call MAKE ENCODABLE and STUDY TYPE.
32400
32500 ⊗4CLARIFY IMPROBABLE SITUATION⊗*
32600 This being demands that something of mediocre priority (≤500) exist on
32700 the coding warning assert-list. It likes (51) this, and dislikes (-41)
32800 anything on the undefined section list, or anything (-200) on the encodable
32900 section list. As always, a sentence is provided to justify each of these
33000 little beliefs. We choose the warning with the highest priority (lowest
33100 numerical weight) on the coding warning list, note that that is what PUP
33200 is working on now, and do a match to decide what type of warning it is.
33300 (i) Replace x by y in z
33400 Here these may be nonspecific; z may be "in all code recently generated".
33500 The nature of y may cause us to include new warnings; y may mention a new
33600 data structure.
33700 (ii) x in y is undefined; probably z since r
33800 This may cause us to add to the global initialization list. It will
33900 probably cause us to ask the user what the answer is.
34000 (iii) x is a data structure but we don't know much about it
34100 We try to find out its structure, how to initialize, access, insert,
34200 delete, update it. A variant of this warning is:
34300 (iv) We find no x's associated with data structure y
34400 Here x can point to insertions, deletions, initializations, etc.
34500 If we can't justify the lack, we try to defer the decision. Failing
34600 that, we ask the user.
34700 (v) Command if x then y
34800 This is a programmed demon; when situation x is true, we must do y.
34900 (vi) Delete all mention of x
35000 This is like a replace, but we go through the assert lists with an eye
35100 toward deleting unnecessary worries.
35200 (vii) Infinite loop in x from y to z
35300 If we can't justify this, we insert a test to break out of the loop.
35400 Justification might be that this loop is in the top-level function of
35500 the system, where we never wnat to break out.
35600 (viii) Incomprehensible (there is a "bug" in x manifesting itself as y)
35700 Never needed to write CF, so not implemented. SHould call FIX INCORRECT
35800 PIECE (which is also not in yet) or ask the user for assistance.
35900
36000 ⊗4GET NEW INFORMATION⊗*
36100 Naturally, it is not thrilled if (-68) there exists some new but
36200 unexamined information, and it is happy (50) if there is none.
36300 The prerequisites ensure that the user is aware of what PUP wants,
36400 and if the theorem prover can't deliver it, PUP asks the user for some.
36500 If PUP asks for something general ("any task") it is because it knows
36600 precisely that this is what it wants, not out of ignorance! During
36700 execution, the specificity check demon is active; he ensures that
36800 it is indeed phrased as specifically as possible; if not, MAKE SPECIFIC
36900 will be called. This is a very uncomplicated being, and a very unpopular
37000 one to use since we should squeeze every drop of meaning out of what
37100 we have before asking for more information.
37200
37300 ⊗4EXTRACT RELEVANT SUBSET⊗*
37400 This likes (70) there to be a great deal (≥50 pieces) of new information,
37500 and dislikes (-80) it if there are under a dozen such tokens.
37600 It finds and evaluates knowledge variables to constrain what should be
37700 looked at currently. This is never called in the dialog, though it was in
37800 the protocol.
37900
38000 ⊗4ANALYZE IMPLICATIONS⊗*
38100 The WHEN part is unhappy (-60) if there is usable information already,
38200 since this being is fairly costly. It also examines the size of the
38300 new info list to see just how long this search will be. The being locates
38400 the code that will be affected, predicts the affects, and sees how
38500 desirable this is. This being is also never needed to write CF.
38600
38700 ⊗4MAKE ENCODABLE⊗*
38800 If all else fails, this being tries replacing a function by one of its
38900 alternatives or, as a last resort, by one of its generalizations. This
39000 last resort is never needed to write CF.
39100
39200 ⊗4ENCODE⊗*
39300 Despite its key name, this being is mostly a bookkeeper! It runs around
39400 the assert lists, gathering up enough information to encode a specified
39500 little piece of code. A program schema is built up, instantiated as much
39600 as possible, printed out for the user to refer to, and passed to a highly
39700 optimized recursive auxilliary function, GETCODE. Some woorying about the
39800 arguments is done,including what they might be instantiated as. We inform
39900 the user of the code being written, and a prerequisite causes the function
40000 to become made into a being.
40100
40200 ⊗4STUDY TYPE⊗*
40300 This being accepts a being call, looks at that being's specializtions
40400 part, translates each entry into a decision, and dumps these onto the
40500 undeferred decision list. A deferral demon promtly attends to them.
40600
40700 ⊗4GET DATA STRUCTURE⊗*
40800 We call select-structure type, and use its advice to point to the
40900 proper knowledge variable. We eval it to set up the various parts of
41000 the data structure. In unclear cases, we may ask the user whether the
41100 argument really is a data structure. We ensure that this object is
41200 a being (else we make it one,) and we add warnings to the effect that
41300 there might not be any insertions or deletions; we'll worry about that
41400 much later, by another being. We know that the elements of a data
41500 structure are themselves usually data structures, so one the prerequisites
41600 says that we must try to make the arguments into data structures as well.
41700
41800 ⊗4GETCODE⊗*
41900 This is a long, special-case, recursive function. It goes through a piece
42000 of code with two jobs: to build up a list of arguments to this function,
42100 and to get names for specific instances of general beings.
42200 Part of the kludgy character is due to the fact that there are non-beings
42300 in the code: primitive forms (structure, tuple, vector, class, comment,
42400 atoms), primitive functions (read, null, and ,...), primitive variables
42500 (t, nil, 4, ...). By keeping closely to the theoretical ideals implicit
42600 in the ideas, this function would probably become trivial.
42700
42800 ⊗4GET NAME⊗*
42900 First we study plausible names for the new specialized being. We make the
43000 user aware of them. We examine the being name carefully, to see if it
43100 was just mentioned in the same piece of code (probably then the same name),
43200 whether it's ever been mentioned and specialized, and so on. Sometimes we
43300 end up deciding not to get a new name, sometimes we pick one and just tell
43400 the user, sometimes we recommend some old specialized name. In 90% of the
43500 cases, though, we simply say "give me a name for ...". A new-name counter
43600 is maintained to ensure unique names, and this number is appended onto the
43700 end of what the user types, uunless it's an already-existing being name.
43800 The user my type ] (and usually does), indicating that PUP is to choose.
43900 PUP always informs the user what the name is at the end.
44000
44100 ⊗4MAKE NEW BEING⊗*
44200 This being has the awesome responsibility of giving life to new beings.
44300 As is the case with neurons, soldiers, and all (good) beings, no one being
44400 really does much when you look at it in isolation. Thus this one already
44500 gets the name of the being, the name of its parent (its least general
44600 generalization), how the metacodes of these tow differ, the arglist,
44700 the reason for this specialized being to exist, what extra qualities it
44800 possesses or lacks wrt its parent, etc. It can figure out who it affects
44900 by examing its various parts, and it bases the complexity vector on
45000 that of the parent and on the changes in this new being. Thus it basically
45100 does gets, evals, and puts. It updates various assert lists, and semi-
45200 compiles the new being. One bit of knowledge says that the explicit args
45300 check may be significantly different, and the user may be queried.
45400
45500
45600
45700 ⊗5E. Low-level problem-independent knowledge: bits of programs⊗*
45800
45900 ⊗4PROPOSE PLAUSIBLE NAMES⊗*
46000 This being is called by GetName, and should incorporate a good model of user
46100 psychology of name-choosing. Currently, it applies Initials, Mainwords,
46200 Firstfew, and compositions of these, to the task description sentence.
46300
46400 ⊗4SEMI COMPILE⊗*
46500 Basically, beings only lend themselves to interpreting; to help speed up
46600 this process, this being can take advantage of regularities and simplicities
46700 in anohter being's parts, and turn out a fast function to do an equivalent
46800 job. For example, if a being doesn't enable any new demons, we needn't
46900 push the demon stack at the beginning and pop it at the end.
47000
47100 ⊗4SELECT STRUCTURE TYPE⊗*
47200 This is a true bit of programming knowledge: if the structure is composed
47300 of several weakly-interacting pieces tied together through association with
47400 one atom, then the data structure should be a list of these atoms, with
47500 the associated structures being stored on the property lists of the atoms.
47600 If there are only a couple pieces, or they interact strongly, we should
47700 use a nested list structure instead.
47800
47900 ⊗4ELEMENT⊗*
48000 All we know about an element is that it is a member of a data structure,
48100 and that we should not be ashamed to ask the user to clarify himself if
48200 he mentions this. The ConceptFormation being should note that future
48300 refernces to element actually refer to a scene, an instance of a concept,
48400 and that references to class refer to the model of a concept, a set of
48500 scenes. This may change as new data structures come into existence.
48600
48700 ⊗4MODIFY STRUCTURE⊗*
48800 Generally, we will be given a typical element, and must identify the
48900 structures it blongs to, and modify them. The precise details indicate that
49000 some subset of CONDITIONAL INSERTION, CONDITIONAL DELETION, and COMPLEX
49100 ALTERATION will be involved.
49200
49300 ⊗4GET HOLD OF⊗* by guessing and checking
49400 We must discover whether an algorithm already exists which can get arg1.
49500 If not, we try to find one which can get arg1 and some other effects. We
49600 structure the function as some of COMPUTE, GENERATE&TEST, and SEARCH.
49700 Finally, we must decide now on the type of error recovery desired: none,
49800 backtrack, or correct-and-try-to-proceed.
49900
50000 ⊗4TAKE HOLD OF⊗* trivially
50100 We examine the flow of control through the preceding code, to decide
50200 whether arg1 has ever been SET. If so, we must ask the user whether or not
50300 to read in a new value. If no read is to be done, then this being reduces
50400 to a simple assignment, or perhaps to nothing at all. Otherwise, we read
50500 in the object, and assign its various subparts to SOME PART OF it.
50600
50700 ⊗4IS OF TYPE⊗*
50800 This being is a predicate which is too low-level and general to do much.
50900 Basically, it helps formulate a question to the user, who must explain
51000 to PUP how to construct any predicate it comes across, usually just by
51100 translating a sentence the user types in.
51200
51300 ⊗4COMPLEX ALTERATION⊗*
51400 This being is always replaced by some initializing assignments, followed
51500 by calls on MODIFY STRUCTURE for some subparts of arg1. A bit of advice
51600 is that if arg1 is unstructured, try to get it structured. If this fails,
51700 maybe what is really wanted is to modify the structure of which arg1 is
51800 a member.
51900
52000 ⊗4CONDITIONAL DELETION⊗*
52100 As above, we look at the structuring of various arguments to decide what
52200 is really supposed to be deleted from where. We check with the user,
52300 remind him of various bindings relevant to the current call, and ak him
52400 to describe (1) under what conditions we do the deletion, (2) what exactly
52500 do we delete. These are translated, analyzed in the context of deletion,
52600 and help determine the deletion piece of code.
52700
52800 ⊗4CONDITIONAL INSERTION⊗*
52900 This is similar to the preceding being. Here we also worry about
53000 whether the entity to be inserted has ever been bound. If not, we must see
53100 that it is! Often, this binding will be related to the Initialize part of
53200 the DataStructure piece of the being representing the structure we're
53300 inserting into. Since some data structures have several similar but
53400 distinctly-named elements existing simultaneously, we have lots of little
53500 worries. After we have planned out the code, we check with the coding
53600 warning list and add to it; e.g., any undefined initial value of a variable
53700 in a piece of code we stuck in here, will also be uninitialized here. If
53800 we later decide never to worry about the first initialization, we must
53900 not forget this one! This is a frequent source of bugs, I think.
54000
54100 ⊗4GENERATE&TEST⊗* and ⊗4COMPUTE⊗* are not needed or implemented.
54200
54300 ⊗4SEARCH⊗*
54400 Currently, a primitive type of searching suffices. We simply execute
54500 the iteration FOREACH X IN XSET DO (TEST X). If we're unsure, we check
54600 that XSET has the form SET OF (PLURAL X). The specializations tell us to
54700 worry about termination conditions. I suppose this being could also be
54800 used for generate-and-test.
54900
55000 ⊗4FOREACH⊗*
55100 Not surprisingly, this is the iteration being. It is an NLAMBDA function,
55200 so its arguments aren't surrounded by quotes. There are various minor
55300 decisions to make, which simplify any specialized version:
55400 there may or may not be an "until" condition; we must get it and also
55500 decide what to bind the iteration variable to and what value to return,
55600 both in case this condition is met, and in case we exhaust the space.
55700 Often, we will decide to leave some of these unspecified, put notes about
55800 them on the coding warning list, and not worry about them for a long time.
55900
56000 ⊗4TEST⊗*
56100 This being is a predicate which is oriented toward a threshold of accepta-
56200 bility, whereas ISOFTYPE is oriented toward separating cases. It either
56300 has the flavor of comparing, or of competition. We must also decide
56400 whether the result is nominal, ordinal, or of ratio caliber. These latter
56500 two never crop up, which is why we assume the test is a predicate always.
56600
56700 ⊗4SOME PART OF⊗*
56800 We either get this simple structural function by examples (like SHaw's
56900 program) or by translating a user-supplied English sentence describing
57000 what it does. Typically, some combinations of CAR, CDR, CONS, and NIL.
57100
57200 ⊗4COMPARE⊗*
57300 We must worry about whether the result is nominal, ordinal, or ratio.
57400 We also decide whether the comparison is a JOINING FUNCTION applied to
57500 its arguments, a constant function (executed only for side effects),
57600 or a JOINING FUNCTION applied to the results of COMPAREing corresponding
57700 subparts of the arguments. This last case is both common and complicated.
57800 If neither argument is structured, this is impossible. If both are highly
57900 structured, their structures must have a nontrivial amount of correspondence
58000 in order to succeed. If only one argument is structured, this strongly
58100 suggest that the other one should be similarly structured. Often we go
58200 ahead and structure it without asking the user.
58300
58400 ⊗4BETTER⊗*
58500 We've discussed this earlier. Here we should note that when called on
58600 to write a new, specialized BETTER function, it chooses either a simple
58700 function (T, NIL) to allow an optimization (replace by CONS, replace by
58800 NCONC1), or it chooses a complex comparing function (e.g., alphorder,
58900 write-program itself!).
59000
59100 ⊗4JOINING FUNCTION⊗*
59200 This is a way of condensing results. It is typically a known function,
59300 asuch as AND, OR, PLUS, MAXIMUM, PROGN... which is determined by (i) the
59400 character of the result (numerical, T/F) and (ii) whether we are in
59500 a DO UNTIL loop, a DO REPEATEDLY loop, or neither of these loops.
59600
59700
59800 ⊗5F. Programming Knowledge stored in variables⊗*
59900
60000 To resolve an ADAPTATION decision
60100 Ask for sample dialog, symbolically run current code, modifying I/O formats.
60200
60300 To defer any decision whose AFFECTS part is known
60400 It may translate as some detail of x; in that case, wait until some code for
60500 x already exists. The affects part may translate as The x algorithm; in this
60600 case we worry as soon as PUP begins DO-ing any coding at all of x.
60700
60800 To defer an ALTERNATIVES decision
60900 Examine the coding tasks in all cases, and try to find some common head
61000 and/or some common tail. If there is any, try to defer the decision until
61100 after writing code for this common subtask sequence. If one alternative
61200 exactly matches the common sequence, we can temporarily assert that the
61300 decision has been made to do this simplest being.
61400
61500 To terminate an AND loop
61600 The conjunction is usually between highly similar objects. Related to how
61700 to parse a sentence containing ANDs.
61800
61900 To resolve a BOOLEAN decision
62000 Ask the user to respond Yes or No. The decision has special Yes, No, and
62100 Both parts; in each case, two of them will be evalled.
62200
62300 To resolve a DEFINITION decision
62400 Locate the defined object, reaffirm that it is undefined, ask the user to
62500 define it, check whether it is a predicate, data structure, etc., and tell
62600 the user about such constraints.
62700
62800 To resolve a DICHOTOMY decision
62900 Each such decision contains a little program which now is evaluated. If
63000 it succeeds, its value answers the decision; if it fails, we have to ask
63100 the user to choose alternative 1 or 2. The choice points to a little
63200 program to run, and its value is the desired resolution.
63300
63400 To resolve a PREDICATE decision
63500 Fix the context of the predicate call, try to relate this predicate to some
63600 known tests, and, if this fails, ask the user. His response is specially
63700 processed in the context of a predicate fixation.
63800
63900 To set up a data structure stored as a LIST STRUCTURE
64000 The initialization is a simple SETQ to an as-yet undefined value.
64100 Access is simply by mentioning the name. Insertion is by Merge:in or
64200 Merge. Deletion is by Pullout or Setdifference.
64300
64400 To set up a data structure stored as a PROPERTY LIST STRUCTURE.
64500 Delineate the pieces to be associated with each atom, and name them.
64600 Accession is via GETP. Insertion s via a GETP, then a MERGE:IN, then a
64700 PUT. Deletion is via a GETP, then a PULLOUT, then a PUT. Initialization
64800 is a PUT of a SETQ to an as-yet undefined value. Each named substructure
64900 must itself become a data structure, typically a simple list structure.
65000
65100 To resolve a SOMEOF decision
65200 The user must be sufficiently informed about the choices. He types back
65300 a non-null, ordered subset of those choices. We examine them to see if
65400 they should be enclosed in a PROGN or in a REPEATEDLY, and whether we
65500 would like to see something structured in a certain way.
65600
65700 To resolve a SUBSETOF decision
65800 Similar to above, but no fancy investigation, just string the choices
65900 together in a PROGN.
66000
66100 To defer any decision whose WHEN part is provided
66200 We transform "before deciding firmly how to ←x ←←y" into something like
66300 "member (cons detail-of-$x-ing $y) doing-pup-list". We also translate
66400 Before any ←←x routine is finalized, After ←←x routines are finalized,
66500 and similar phrases. These must always evaluate T or NIL.
66600
66700
66800 ⊗5G. Demons and functions of interest⊗*
66900
67000 LIST:JOIN, MERGE:IN, PULLOUT
67100 These rather standard functions are gicen a tiny bit of advice: if their
67200 "element" is more like a sublist than an element of their "list", then
67300 they assume that what was meant was append or setidifference, not cons
67400 or merge or remove.
67500
67600 LONG NAME DEMON
67700 If any name becomes unwieldy, he notices it and asks the user for a
67800 nickname.
67900
68000 PERMIT DETAILED DECISION
68100 Implicit near the beginning of each decision, PUP called this function.
68200 It asks the user if he wnats more details, and if so it gets them and
68300 prints them out.
68400
68500 STRUCTURE INDUCING DEMON
68600 If the object is a being already, special considerations apply.
68700 If the object and all arg values are ill-defined, we decide not to do
68800 any structuring. Otherwise, we investigate the effects of structuring
68900 the object into the pieces specified in the args. If there is no problem,
69000 and if the user consents, we tack the appropriate Replace messages onto the
69100 coding warning list (with a high priority). We activate Long Name Demon.
69200
69300 ⊗5Natural Language Translation⊗*
69400 We have already discussed the TRANSLATE being and the basic way of handling
69500 natural language input. Several beings exist primarily for this purpose;
69600 RECOGNIZE:CONDITIONAL, RECOGNIZE:CONJUNCTION, RECOGNIZE:EQUALITY,
69700 RECOGNIZE:FUNCTION:RETURNS, RECOGNIZE:INCLUSION, RECOGNIZE:LITERALS,
69800 RECOGNIZE:SET:RELATIONS, RECOGNIZE:SOME:MEMBER, ADD:DEFINITION,
69900 ADJECTIVE:HANDLER, REPEATEDLY. Also, there are several functions
70000 related to translation: e.g., UNGERUNDIFY, PLURAL, OPPOSITE.
70100 All these are straight-forward, and their task is obvious from their name.
00100
00200 .SKIP TO COLUMN 1
00300 .SELECT 1
00400
00500 ⊗5(ii) The increment of knowledge necessary to write GI⊗*
00600
00700 ⊗4APPLY RULE⊗* This BEING accepts two arguments, which must be
00800 interpretable as a string and a rule, respectively. The string is
00900 compared against the left side of the rule and, if applicable, the
01000 change indicated by the right side is executed on the string. It is
01100 immediately encodable ifthe user wishes all possible apllications of
01200 the rule to the string; else a more specialized version must be
01300 synthesized.
01400
01500 ⊗4CONSTRAIN⊗* Try first to decide if it is meaningful for the given
01600 structure to be constrained, and if so, how. Next see whether any
01700 other BEINGs can help. Finally, ask the user to specify any
01800 constraints he can think of. For example, can a list structure grow
01900 indefinitely, or can we use some fancy programming trick because the
02000 size stays small.
02100
02200 ⊗4GRAMMATICAL INFERENCE⊗* Needless to say, this is a high-level,
02300 domain-specific BEING! It must infer grammars from exemplary strings;
02400 it knows this to be the only reasonable g.i. paradigm. If it fails,
02500 some genralizations are inductive inference, enumeration,
02600 problem-solving; some alternatives are concept formation and
02700 simulated evolution. There are many minor decisions, similar to the
02800 concept formation decisions (e.g., examine a sample GI-user dialogue
02900 to finalize the printing formats). The major decision is dichotomous:
03000 whether our new specialized BEING should retain the ability to input
03100 the type of grammar and vary its strategy accordingly, or whether
03200 only one fixed type of grammar (e.g., context-free) will always be
03300 used, and may be "built in" to the code. The result of this decision
03400 is to pass control to one or the other of the following two BEINGS.
03500
03600 ⊗4INFER MULTI-CLASS GRAMMARS⊗* We read in the type of grammar, and
03700 then call the appropriate specialist for that type. Thus we have a
03800 big switch here.
03900
04000 ⊗4INFER FIXED-CLASS GRAMMARS⊗* This routine determines at
04100 program-synthesis time what the class is going to be, and thus will
04200 be a fixed call to one of the following four BEINGS. The speedups
04300 will arise from using the constraints on the rules.
04400
04500 ⊗4INFER PHRASE-STRUCTURE GRAMMARS⊗* There are no rule constraints in
04600 a type 0 grammar; each half of the rule is viewed as an arbitrary
04700 list of letters. When a TEST is indicated, the
04800 fringe-of-conciousness demon must report it is thinking of PARSE.
04900 The left and right sides of a rule will be destructive operations on
05000 the data representation of a rule.
05100
05200 ⊗4INFER CONTEXT-SENSITIVE GRAMMARS⊗* We shall only report on the
05300 differences between the INFER... BEINGS. This one knows that the
05400 right side of the rule must be at least as long as the left side.
05500 This will be used as a pruning heuristic when proposing plausible
05600 rules.
05700
05800 ⊗4INFER CONTEXT-FREE GRAMMARS⊗* Grammars of type 2 have as their left
05900 side a single nonterminal. Further simplifications can occur by only
06000 working toward a Griebach Normal Form or Chomsky Normal Form grammar,
06100 although from the standpoint of inference energy these are harder.
06200
06300 ⊗4INFER REGULAR GRAMMARS⊗* A type three grammar has a unit on the
06400 left and a pair of them for a right side (one terminal, one
06500 nonterminal). This is a very powerful pruning heuristic.
06600
06700 ⊗4MAJOR MODIFY STRUCTURE⊗* The old, simple "insert,delete,alter"
06800 paradigm of modification was no longer sufficicient. This BEING heads
06900 a whole complex of modify BEINGS, including the old one as the
07000 low-level workhorse primitive. Here is a sketch of the organization:
07100 MAJOR MODIFY STRUCTURE
07200 | | | MODIFY UNTIL | MODIFY SOME
07300 | | | THE ORIGINAL MODIFY STRUCTURE
07400 BEING So this top-level modifier calls some subset of the three lower
07500 modification BEINGS. Later, we will add a fourth alternative, EXAMINE
07600 DATA STRUCTURE, to aid in writing the AIR program.
07700
07800 ⊗4MODIFY SOME⊗* We determine a set S and a predicate P at synthesis
07900 time. At run time, we map through S and apply P; all elements
08000 responding positively are modified (using the original MODIFY
08100 STRUCTURE.) The decisions about P and S are easily deferrable.
08200
08300 ⊗4MODIFY UNTIL⊗* This BEING is simply an order to compose
08400 REPEATEDLYπ.MODIFY_STRUCTURE. The former BEING bears the brunt of
08500 the responsibility of the interface.
08600
08700 ⊗4PARSE⊗* Attempt to parse a string from the current set of rules, by
08800 reversing each rule and composing their applications. We decide at
08900 synthesis time whehter or not to maintain a parse tree, whether or
09000 not to maintain a list of the rules used during the parse, whether we
09100 stop parsing at any legal string or only at "S," whether we parse
09200 forwards or backwards or both, how deeply in each direction (this is
09300 always deferred until much later), whether one direction after
09400 another or (simulated) simultaneously. We look for the data structure
09500 part of the BEING representing a rule, and ask him about constraints
09600 on the rule, and about how to destructively recover the left and
09700 right sides separately.
09800
09900 ⊗4PARSE BACKWARD⊗* This BEING is given two strings and a set of
10000 rules; the task is to apply anti-rules to the target string until it
10100 becomes the initial string. This is typically done breadth-first.
10200 Special modifications must be made if there is a parse tree to
10300 maintain, if a set of rules used must be maintained, etc. The basic
10400 body is a nest of FOREACH calls (∀rule, ∀way of applying therule to
10500 the string, recurse). To avoid infinite recursion, we must supply a
10600 third argument, the depth to which we compose these anti-rules before
10700 we give up. When calling itself recursively, this level will
10800 bedecremented.
10900
11000 ⊗4PARSE FORWARD⊗* This BEING is analogous to the previous one, using
11100 rules themselves instead of anti-rules. Notice how clearly the place
11200 to insert searching heuristics is marked out for us (althoug none are
11300 present.)
11400
11500 ⊗4STRING⊗* This is a structure whose parts are a name, a list of
11600 letters, a set of comments. It is advisable to use list structures
11700 rather than property lists to represent strings, since they will
11800 probably only be accessed by one of their three parts. In the GI
11900 program, we don't use STRING itself, but rather we mention
12000 UNCOMMENTED STRING, which causes this BEING to create a new
12100 specialized version of itself, sans the third, comments part.
00100
00200 ⊗5(iii) The increment of knowledge necessary to write AIR⊗*
00300
00400 ⊗4EXAMINE STRUCTURE⊗* This is another one of the parts of a major
00500 MODIFY structure. If the fringe of conciousness demon can't come up
00600 with a reasonable matching function, one is selected now. The basic
00700 body says to do PATTERN MATCH, using this match function, and convey
00800 the results to the caller (who may be the user.) The inputs are thus
00900 a pattern, a data structure name, and possibly a hint to a match
01000 function.
01100
01200 ⊗4PATTERN MATCH⊗* This existed as a system function earlier, but for
01300 AIR it is necessary to write a tailored pattern-matcher. In
01400 particular, AIR demands that we strip away the common head and tail
01500 from both pattern and expression, and then compose the two remaining
01600 pieces into the left and right sides of a new plausible rule, and
01700 then check that this conforms with the constraints on rules. This is
01800 certainly different from the type of match needed by CF! Notice that
01900 we had to add the eliminate common head/tail functions to our list of
02000 system primitive functions.